Optimizing Sponsored Search Ranking Strategy by Deep Reinforcement Learning
Authors
Abstract
Sponsored search is an indispensable business model and a major revenue contributor for almost all search engines. From the advertisers’ side, paying for sponsored search advertisements to take part in the ranking of search results attracts more awareness and purchases, which serves their commercial goals. From the users’ side, personalized advertisements that reflect their propensities make the online search experience more satisfactory. Sponsored search platforms rank advertisements with a ranking function that determines both the list of advertisements to show and the price charged to the advertisers. Hence, it is crucial to find a good ranking function that simultaneously satisfies the platform, the users, and the advertisers. Moreover, advertisement positions shown under different queries from different users may be associated with advertisement candidates that have different bid price distributions and click probability distributions, which requires the ranking functions to be optimized adaptively to the traffic characteristics. In this work, we propose a generic framework to optimize the ranking functions by deep reinforcement learning methods. The framework is composed of two parts: an offline learning part, which initializes the ranking functions by learning from a simulated advertising environment, allowing adequate exploration of the ranking function parameter space without hurting the performance of the commercial platform; and an online learning part, which further optimizes the ranking functions by adapting to the online data distribution. Experimental results on a large-scale sponsored search platform confirm the effectiveness of the proposed method.
Similar resources
Web pages ranking algorithm based on reinforcement learning and user feedback
The main challenge of a search engine is ranking web documents to provide the best response to a user's query. Despite the huge number of results retrieved for a user's query, users examine only the first few; placing the relevant results in the top ranks is therefore of great importance. In this paper, a ranking algorithm based on the reinforcement le...
RRLUFF: Ranking function based on Reinforcement Learning using User Feedback and Web Document Features
The principal aim of a search engine is to provide sorted results according to the user’s requirements. To achieve this aim, it employs ranking methods to rank web documents based on their significance and relevance to the user query. The novelty of this paper is a user feedback-based ranking algorithm using reinforcement learning. The proposed algorithm is called RRLUFF, in which the rank...
Actor-critic versus direct policy search: a comparison based on sample complexity
Sample efficiency is a critical property when optimizing policy parameters for the controller of a robot. In this paper, we evaluate two state-of-the-art policy optimization algorithms. One is a recent deep reinforcement learning method based on an actor-critic algorithm, Deep Deterministic Policy Gradient (DDPG), that has been shown to perform well on various control benchmarks. The other one ...
A Technique for Web Page Ranking by Applying Reinforcement Learning
Ranking web pages, so that the pages most important to a user's query are shown, is one of the essential problems in any web search tool. Today’s need is to return significant data for a user's query. The importance of web pages depends on the interests of users. Two ranking algorithms are used to demonstrate the current ranking framework: one is PageRank and the other is the BM25 calculation. ...
Operation Scheduling of MGs Based on Deep Reinforcement Learning Algorithm
In this paper, the operation scheduling of Microgrids (MGs), including Distributed Energy Resources (DERs) and Energy Storage Systems (ESSs), is proposed using a Deep Reinforcement Learning (DRL) based approach. Due to the dynamic characteristic of the problem, it is first formulated as a Markov Decision Process (MDP). Next, the Deep Deterministic Policy Gradient (DDPG) algorithm is presented t...